(54) Title: ADAPTIVE FILTER FOR A DIRECT SEQUENCE SPREAD SPECTRUM RECEIVER
(57) Abstract

A direct sequence code division multiple access (DS-CDMA)
receiver comprises an adaptive filter (8) controlled by an
adaptive algorithm (10) for filtering data which has been
multiplied at (2) by a spreading code, the filter having a
length equal to the number of chips in the code, and a
multiuser detector (14) operating on the output of the
adaptive filter. Preferably, either the fast a-posteriori
error sequential technique (FAEST) algorithm or the
stabilised FAEST (SFAEST) algorithm is used as the
algorithm (10).

ADAPTIVE FILTER FOR A DIRECT SEQUENCE SPREAD SPECTRUM RECEIVER

Background to the Invention 

The present invention relates to a telecommunications
receiver employing a new direct sequence code division
multiple access (DS-CDMA) architecture which allows the use
of fast adaptive algorithms.

Two adaptive algorithms are commonly in use, the LMS
and RLS algorithms [1, 2, 3], and these are described in
Appendix I.

The least mean square (LMS) algorithm (and the closely
related normalised least mean squares (NLMS) algorithm) is
a stochastic gradient algorithm which has only one
parameter, the step size μ. The LMS algorithm is
computationally simple but its convergence rate is slow and
highly dependent on the properties of the input signal, more
specifically on the eigenvalue ratio of the autocorrelation
matrix. When many elements of the input signal are unknown,
for example the channel in a mobile communications system,
it is difficult to choose μ. The algorithm is numerically
stable, but an inappropriate choice of μ can cause
instability. In high noise conditions, the eigenvalue ratio
of the autocorrelation matrix is low and this can help with
convergence.

The recursive least squares (RLS) algorithm is
computationally much more complex, but has much faster
convergence than the LMS algorithm. It has two parameters,
the forgetting factor λ and the initial diagonal matrix term
δ. The forgetting factor is set according to the rate of
change of the autocorrelation of the input signal. The
diagonal term has little effect on the algorithm once
converged, but does affect the size of internal variables
within the algorithm during initial convergence. The RLS
algorithm is usually considered converged within a number of
iterations equal to twice the filter length, which is
generally much faster than the LMS algorithm. The RLS
algorithm can become numerically unstable when the
autocorrelation matrix of the input signal is close to being
a singular matrix.

There are a few much less common adaptive filter
algorithms and the use of these algorithms has been found
desirable. The fast a-posteriori error sequential technique
(FAEST) [5, 6] algorithm and its stabilised version, the
SFAEST [7], which are also described in Appendix I, have a
convergence rate close to the RLS algorithm but complexity
close to the LMS algorithm. They do however impose an
additional constraint: the input signal must have a shift
invariant property. The shift invariant property simply
means that the input signal must be the same as the input
signal on the previous iteration shifted on by one sample,
with only one new sample. This property is not satisfied by
the conventional architecture for a minimum mean square
error (MMSE) receiver for a DS-CDMA system [8]. The
numerical stability of the FAEST algorithms is not as well
understood as for LMS and RLS, but in practice the SFAEST
algorithm seems to remain stable for a sufficiently long
period of time for the purpose proposed here. The Fast
Newton algorithm (see Appendix I) is an algorithm which can
simplify the calculation of any of the above adaptive filter
algorithms if the input signal can be modelled as an
autoregressive filter with order less than is assumed by the
above filters.
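The shift invariant property described above can be illustrated with a small sketch. The function names, the 8-chip sequence and the filter length of 4 are hypothetical choices made only for this illustration, and the bit-rate case is simplified to a filter whose contents are replaced wholesale every N chips.

```python
from collections import deque

def sliding_input_vectors(samples, length):
    """Yield successive filter input vectors, newest sample first."""
    window = deque([0.0] * length, maxlen=length)
    for s in samples:
        window.appendleft(s)  # one new sample enters, the oldest drops out
        yield list(window)

def is_shift_invariant(prev, curr):
    """True if curr is prev shifted on by one sample, with one new sample."""
    return curr[1:] == prev[:-1]

chips = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]

# Chip-rate operation: the input vector advances by one sample per iteration.
vectors = list(sliding_input_vectors(chips, 4))
chip_rate_ok = all(is_shift_invariant(p, c)
                   for p, c in zip(vectors, vectors[1:]))

# Bit-rate operation (simplified): the contents change completely every N chips.
N = 4
blocks = [chips[i:i + N][::-1] for i in range(0, len(chips), N)]
bit_rate_ok = all(is_shift_invariant(p, c)
                  for p, c in zip(blocks, blocks[1:]))
```

In this toy case `chip_rate_ok` is true and `bit_rate_ok` is false, which mirrors the reason given in the text for FAEST and SFAEST not being usable when the filter is trained once per data bit.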

The conventional architecture for the uplink and
downlink of a DS-CDMA system with an adaptive filter
receiver is shown in Figure 1. In this architecture, the
training of an adaptive FIR filter 1 of length N+P-1 chips
(N being the number of chips per data bit and P the total
number of chips in the code) is done at the bit rate, using
an adaption error found by the algorithm to be the
difference between data from a particular user and a sampled
estimate of the data from the output of the filter 1, i.e.
the filter has an effective training path ETP. The contents
of the filter 1 change completely from one iteration thereof
to the next. This means that convergence is slow and it is
not possible to use the FAEST or SFAEST algorithm because
the shift invariance property is not satisfied. With the
LMS algorithm, convergence is too slow and the time taken to
reconverge when a user switches on or off is far too long.
This architecture does work reasonably well with the RLS
algorithm, although convergence is still not very rapid and
the computational complexity is very high.

Summary of the Invention

It is one aim of the present invention to provide a
DS-CDMA receiver using an adaptive filter in which the
convergence is rapid. It is another aim of the invention to
allow the use of the less common adaptive algorithms, which
has not hitherto been possible.

The present invention provides a direct sequence code
division multiple access (DS-CDMA) receiver comprising an
adaptive filter controlled by an adaptive algorithm for
filtering data which has been multiplied by a spreading code
and filtered by a channel filter, the adaptive filter having
a length appropriate to model the inverse of the channel
filter, and a multiuser detector operating on the output of
the adaptive filter.

The algorithm is preferably either trained using the
spread-multiplied signal of a desired user only, or from a
composite signal which is the sum of the spread-multiplied
signals of more than one, for example all, transmitting
users. This means that the adaptive filter will be trained
by new information at the chip rate of the code.

In a particular embodiment of the invention the fixed
multiuser detector is of the minimum mean squared error
(MMSE) type, but it may alternatively be of the zero forcing
(decorrelating), Volterra, radial basis function,
cancellation, near optimum or other decoding types.

The algorithm can for example comprise the least mean
squares (LMS) or recursive least squares (RLS) algorithm.
However, because the adaptive filter of the invention
satisfies the shift invariance property, the algorithm may
alternatively comprise the fast a-posteriori error sequential
technique (FAEST) algorithm or the stabilised FAEST (SFAEST)
algorithm, and the above algorithms or others may be used in
combination with the Fast Newton algorithm.

Brief Description of the Drawings

In order that the present invention may be more
readily understood, reference will now be made, by way of
example only, to the accompanying drawings, in which:

Figure 1 shows the conventional DS-CDMA architecture
already discussed;

Figure 2 shows DS-CDMA architecture according to an
embodiment of the invention;

Figures 3, 4 and 5 are graphs of simulation results,
showing respectively the comparative convergence of the
architectures of Figures 1 and 2, the relative convergence
rates of different algorithms using the architecture of
Figure 2, and the bit error ratio (BER) results for the
architecture of Figure 2;

Figure 6 schematically shows a DS-CDMA system with no
channel model;

Figures 7 and 8 are graphs showing signal to noise
performance of a Wiener filter calculated for 7 chip and 31
chip Gold codes respectively;

Figures 9 and 10 are graphs showing bit error rate
(BER) performance of the Wiener filter calculated for 7 chip
and 31 chip Gold codes respectively;

Figure 11 is a graph showing convergence properties of
the LMS and RLS adaptive filters;

Figure 12 is a graph showing BER performance of the
LMS and RLS algorithms after 1000 iterations allowed for
convergence, compared with the Wiener optimal and a matched
filter with no MAI;

Figures 13 and 14 are graphs plotting BER against
number of active users in an AWGN channel and a stationary
multipath channel respectively, all users being equal power
and the spreading code length being 7;

Figure 15 schematically shows the construction of a
received signal Y(n);

Figures 16 a) to d) schematically show the structures
of a matched filter, a parallel canceller using matched
filters, a Wiener filter and a parallel canceller using
Wiener filters respectively;

Figure 17 is a graph showing the BER performance of
the filters shown in Figures 16 a) to d); and

Figures 18 a) to d) are graphs plotting simulated BER
averaged over all users against number of users for four
different signal to additive Gaussian noise ratios, with
60,000 data bits per user and a sequence length of 64.

Detailed Description of the Preferred Embodiments

Figure 2 shows DS-CDMA transmitter and receiver
architecture comprising spreading means 2 respectively
operated by each user in which a data signal from the user
is multiplied by one of a set of spreading codes uniquely
allocated to the respective user. The data is supplied to
the spreading means at its bit rate and the code is input at
its chip rate, there being N code chips for each data bit.

The spread-multiplied data is summed at 4 and filtered
in a finite impulse response (FIR) channel filter 6 of
length P chips, the output of which is transmitted, with
incidental Gaussian noise, to a receiver comprising an
adaptive FIR filter 8, also of length P chips.
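The transmit chain just described (spreading at 2, summing at 4, channel filter 6, added Gaussian noise) can be sketched as follows. This is an illustrative model only: the function names, the two 4-chip codes and the 2-tap channel are assumptions made for the example and do not come from the specification.

```python
import random

def spread(bits, code):
    """Multiply each data bit (+1/-1) by every chip of the user's code."""
    return [b * c for b in bits for c in code]

def fir(signal, taps):
    """Causal FIR filter: out[n] = sum_k taps[k] * signal[n - k]."""
    return [sum(t * signal[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(signal))]

def transmit(user_bits, codes, channel, noise_std, rng):
    """Spread each user, sum the chip streams, filter, then add noise."""
    streams = [spread(b, c) for b, c in zip(user_bits, codes)]
    composite = [sum(chips) for chips in zip(*streams)]
    noisy = [y + rng.gauss(0.0, noise_std) for y in fir(composite, channel)]
    return composite, noisy

codes = [[1, 1, -1, -1], [1, -1, 1, -1]]  # hypothetical codes, N = 4
bits = [[1, -1], [1, 1]]                  # two data bits per user
composite, received = transmit(bits, codes, channel=[1.0, 0.5],
                               noise_std=0.0, rng=random.Random(0))
```

With the noise switched off, `received` is simply the composite chip stream convolved with the channel taps, which makes the effect of the channel filter easy to inspect before noise is added.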

An adaptive algorithm 10 controls and trains the
adaptive filter 8. The algorithm 10 basically finds an
error signal which is the difference between (a) either the
spread-multiplied data from the desired user only or the
composite spread-multiplied signal, and (b) the output of
the adaptive filter 8. The choice of training data in (a)
can either be preset or switchable. The effective training
paths ETP for the adaptive filter, representing this
operation of the algorithm, are shown in Figure 2.
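A minimal sketch of chip-rate training follows, assuming a single user, a noiseless hypothetical 2-tap channel and the NLMS update as a stand-in for algorithm 10 (the specification allows several algorithms; NLMS is chosen here only because it is short). The error is formed exactly as in (a) and (b) above: training chip minus adaptive filter output.

```python
import random

def train_chip_rate(chips, received, length, mu=0.5, eps=1e-8):
    """NLMS-train an FIR filter at the chip rate.

    Returns the final weights and the squared error at every chip.
    """
    w = [0.0] * length
    x = [0.0] * length  # filter contents, newest sample first
    sq_errors = []
    for d, r in zip(chips, received):
        x = [r] + x[:-1]  # exactly one new sample per iteration
        y = sum(wi * xi for wi, xi in zip(w, x))
        e = d - y  # (a) training chip minus (b) filter output
        power = sum(xi * xi for xi in x) + eps
        w = [wi + (mu / power) * xi * e for wi, xi in zip(w, x)]
        sq_errors.append(e * e)
    return w, sq_errors

rng = random.Random(0)
chips = [rng.choice((-1.0, 1.0)) for _ in range(2000)]  # spread-multiplied data
channel = [1.0, 0.5]  # hypothetical channel filter
received = [sum(h * chips[n - k] for k, h in enumerate(channel) if n - k >= 0)
            for n in range(len(chips))]

w, sq = train_chip_rate(chips, received, length=8)
early = sum(sq[:20]) / 20
late = sum(sq[-200:]) / 200
```

Comparing `early` with `late` gives a rough picture of how far the squared error has fallen once the filter has had a few hundred chips of training.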

The adaptive filter 8 acts similarly to (but not
exactly the same as) a conventional equaliser. As
previously stated, training can be on the desired user's
signal only, or on the composite chip rate signal. The
filter 8 has fewer taps than the adaptive filter 1 in the
conventional architecture and is trained at the chip rate,
which means much faster convergence. The adaptive filter 8
also satisfies the shift invariance property required for
the FAEST and SFAEST algorithms and this allows LMS
complexity levels with RLS convergence rates. A fixed
multiuser detector 14 is precalculated and stored at the
receiver and is based on knowledge obtained about the number
of users in the system and their spreading codes. This
detector 14 can be of the MMSE type [9] (used in the
comparisons here) or it can be of a different type, for
example zero forcing or decorrelating [10], Volterra [11],
radial basis function [11], cancellation based [13, 14] or
near optimum decoding based on Viterbi decoding [15].

Summary of Multiuser Detection Techniques 



The conventional single user detector or spreading
code matched filter calculates the correlation between the
received signal and the spreading code (or the spreading
code convolved with the channel impulse response in a
multipath channel) over the data bit period. The multiple
access interference (MAI) impacts the ability to recover the
desired transmissions with this receiver and is more
pronounced for high power interfering users.
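The effect of a high power interferer on the matched filter can be seen in a two-user toy example. The 4-chip codes, amplitudes and bit values below are hypothetical and chosen so that the arithmetic is easy to follow; there is no noise and no multipath.

```python
def matched_filter(received, code):
    """Correlate the received chips with the code over each data bit period."""
    n = len(code)
    return [sum(r * c for r, c in zip(received[i:i + n], code))
            for i in range(0, len(received) - n + 1, n)]

def sign(v):
    return 1 if v >= 0 else -1

code_a = [1, 1, 1, -1]  # desired user
code_b = [1, 1, 1, 1]   # interferer; cross-correlation with code_a is 2
bit_a, bit_b = 1, -1

def received_chips(amp_b):
    """One bit period: desired user at unit amplitude plus the interferer."""
    return [bit_a * ca + amp_b * bit_b * cb
            for ca, cb in zip(code_a, code_b)]

equal_power = matched_filter(received_chips(1), code_a)  # 4 - 2 = 2
near_far = matched_filter(received_chips(3), code_a)     # 4 - 6 = -2
decision_equal = sign(equal_power[0])
decision_near_far = sign(near_far[0])
```

With equal powers the decision matches `bit_a`; when the interferer is three times stronger the cross-correlation term outweighs the desired term and the decision flips.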

The matched filter is the optimum detector in AWGN but
the MAI cross-correlation terms are not Gaussian unless
there are many active users. In this situation the optimal
detector is not the conventional matched filter but rather
a form of multi-user detector.

The detector that yields the most likely transmitted
sequence maximizes the probability that it was transmitted.
When all possible transmitted sequences are equally
probable, this is the maximum likelihood sequence estimator
(MLSE). When implemented with a Viterbi algorithm the
complexity is exponential in the number of users. The MLSE
must also estimate the received amplitudes and phases but it
lowers the mobile user transmitter power control accuracy
requirement. Despite the performance and capacity gains
over conventional detection, the MLSE is not a practical
solution.

Here the zero forcing or decorrelating detector
implements an inverse of the received signal autocorrelation
matrix (neglecting the noise) such that the output is
completely free of multiple access interference. The
decorrelating detector was initially proposed in the late
1970s and it provides substantial performance/capacity gains
over the conventional detector and does not need to know in
advance the received signal amplitudes, avoiding sensitivity
to estimation error. It has computational complexity
significantly lower than MLSE. Another desirable feature is
that it corresponds to the MLSE when the user energies are
unknown. A disadvantage of this detector is that it causes
noise enhancement, i.e. the power associated with the noise
term at the output of the decorrelating detector is always
greater than or equal to the noise term at the output of the
conventional detector. A more significant disadvantage of
the decorrelating detector is that the computations needed
to invert the matrix are considerable.
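A two-user sketch of the decorrelating detector is given below. It assumes synchronous users, a normalised code cross-correlation matrix R with a hypothetical off-diagonal value of 0.5, and noiseless matched filter outputs; none of these values come from the specification.

```python
def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

rho = 0.5                         # hypothetical cross-correlation
R = [[1.0, rho], [rho, 1.0]]
amps, bits = [1.0, 3.0], [1, -1]  # weak desired user, strong interferer

# Matched filter outputs y = R A b (noise omitted for clarity).
y = matvec(R, [a * b for a, b in zip(amps, bits)])

# Decorrelator: apply the inverse of R, which removes the MAI terms.
z = matvec(inv2(R), y)
decisions = [1 if v >= 0 else -1 for v in z]

# Noise power at decorrelator output 0 relative to the matched filter.
noise_gain = inv2(R)[0][0]
```

Here `y[0]` has the wrong sign for the weak user while `z` recovers both amplitudes exactly, and `noise_gain` exceeds 1, which is the noise enhancement mentioned in the text.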

A possibly superior linear receiver structure is to
build an adaptive filter that minimises (at its output) the
error power. This implements a partial or modified inverse
of the autocorrelation matrix, dependent on the level of
background noise, to balance the desire to decouple the
users for MAI reduction while not enhancing the background
noise. Again this receiver structure implements a matrix
inversion operation and here we can apply the recursive
adaptive filter techniques. This detector differs from the
decorrelating detector in that it takes the noise terms into
account to minimise the noise enhancement. Thus if noise is
low the minimum mean squared error (MMSE) receiver
approaches the decorrelating detector and achieves close to
optimum performance. On the other hand, if the multiple
access interference is small compared to the noise, then the
matched filter detector solution is approached. This is
exactly analogous to the MMSE linear equalizer used to
combat inter-symbol interference. Unlike the decorrelating
detector, it requires estimation of the received amplitudes
but its complexity is independent of number of active users
and explicit knowledge of the CDMA spreading sequences is
not required for an adaptive implementation.
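The two limiting cases described above can be written out for the textbook synchronous model, which is used here as an assumption and is not taken from the specification. Let y = RAb + n be the vector of matched filter outputs, with R the normalised code cross-correlation matrix, A the diagonal matrix of received amplitudes, b the data bits and σ² the noise variance:

$$ \hat{\mathbf{b}}_{\mathrm{dec}} = \operatorname{sgn}\!\left(\mathbf{R}^{-1}\mathbf{y}\right), \qquad \hat{\mathbf{b}}_{\mathrm{MMSE}} = \operatorname{sgn}\!\left(\left(\mathbf{R} + \sigma^{2}\mathbf{A}^{-2}\right)^{-1}\mathbf{y}\right) $$

$$ \sigma^{2} \to 0:\ \left(\mathbf{R} + \sigma^{2}\mathbf{A}^{-2}\right)^{-1} \to \mathbf{R}^{-1}, \qquad \sigma^{2} \to \infty:\ \left(\mathbf{R} + \sigma^{2}\mathbf{A}^{-2}\right)^{-1} \approx \sigma^{-2}\mathbf{A}^{2} $$

In the low noise limit the MMSE detector coincides with the decorrelating detector. In the high noise limit the transformation is a positive diagonal scaling, so the sign decisions are those of the matched filter. The appearance of A in the MMSE expression is also why that detector needs estimates of the received amplitudes.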

Other researchers have investigated radial basis
function approaches (which in their complete implementation
are similar to MLSE, although various simplifications have
been suggested) and Volterra nonlinear approaches (where the
received signal is passed through a power series nonlinear
expansion before applying a linear filter, e.g. decorrelating
or MMSE). Note that if an MMSE type fixed multiuser detector
is used, the overall impulse response of the adaptive filter
followed by the fixed detector is not exactly the same as
for the conventional architecture even when both systems are
fully converged. This will give different bit error ratio
(BER) results even when both the adaptive filters in the two
receivers are fully converged. Note also that the adaptive
filter 8 has an extremely low signal to noise ratio at its
output as the processing gain has not been applied at this
stage. This leads to a low eigenvalue spread which can help
the convergence and stability of some adaptive algorithms
but others exhibit instability in the extremely high noise
conditions.

Possible multiuser detectors are discussed in more 
detail in Appendix II. 

The output from the detector 14 is down-sampled to the
bit rate at the synchronous points to give the required
estimate of the user's data signal.

Simulation Results 

All simulation results presented below are for 16 chip
spreading codes (i.e. P=16) and a six tap stationary
channel. Firstly we show that the new architecture has far
faster convergence than the conventional one. Figure 3
shows convergence curves for the conventional architecture
of Figure 1 and the new architecture of Figure 2. The new
architecture is much faster than the conventional one,
mostly because it is trained at the chip rate instead of the
bit rate but also because the eigenvalue ratio of the
autocorrelation is reduced in high noise.

The graph of Figure 3 shows ensemble averaged squared
error at the output of the adaptive filter in both
architectures plotted against time measured in chips, for
the single user case. Both curves use the LMS algorithm
with the value of μ individually optimised for each. The
use of chip rate training instead of bit rate training
increases the convergence rate by a factor of 16, but in
fact the convergence is more than 16 times faster because
high noise reduces the eigenvalue ratio in the
autocorrelation. The new architecture converges to an MMSE
16 times higher than the conventional architecture, but this
loss will be largely regained through the fixed multiuser
detector.
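As a rough worked check on the two factors of 16 quoted above, assume N = 16 chips per data bit (as in these simulations) and assume that the processing gain of a code equals its number of chips per bit:

$$ \frac{\text{weight updates per bit, chip rate training}}{\text{weight updates per bit, bit rate training}} = \frac{N}{1} = 16, \qquad 10\log_{10} 16 \approx 12.0\ \text{dB} $$

On this reading, the 16 times higher MMSE at the output of the adaptive filter amounts to about 12 dB, which is the same size as the processing gain that has not yet been applied at that point in the receiver.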

Figure 4 shows the relative convergence rates of some
different adaptive filter algorithms using the architecture
of Figure 2. The use of the SFAEST algorithm is only
possible in the new architecture, as the conventional one
does not satisfy the shift invariance property.

Relatively speaking, the LMS algorithm is very slow.
It can be made faster by increasing the value of μ but when
this value is increased too far, the LMS algorithm does not
converge to the MMSE error floor (misadjustment error). The
RLS is fast, but at the price of high computational
complexity. The SFAEST algorithm is as quick as the RLS
algorithm if it is initialised appropriately.

Figure 5 shows BER results for the new architecture
after allowing 160 chips (only 10 data bits) for convergence
of the adaptive filter.

With only 160 chips for convergence, the LMS
algorithm is not fully converged when the training period
ends and this results in a slightly poorer performance than
is obtainable with the conventional architecture. The RLS
and SFAEST give similar results.

The architecture of the receiver of the invention is
much faster converging and tracking than the conventional
architecture with only a small loss in BER performance for
a stationary channel. In a time varying channel, average
BER will be better for the new architecture because of
reduced tracking errors. The fast adaptive algorithms
(FAEST, SFAEST, preferably in combination with Fast Newton)
allow the new architecture to achieve good convergence
without the computational complexity of the RLS algorithm.

The receiver of the invention could be integrated in
a "hard-wired" form or could be made capable of being
updated by using reconfigurable or replaceable firmware.

APPENDIX I



Summaries of Adaptive Algorithms 



Least mean square algorithm

From the stochastic gradient approach [1] one obtains
the well known least mean squares (LMS) algorithm, which was
first introduced in 1960 by Widrow and Hoff [2]. This
algorithm is uncomplicated and yields acceptable performance
in most cases. The LMS technique is probably the most
frequently used adaptive algorithm in current communication
systems. It is derived by applying the method of steepest
descent minimisation to the Wiener-Hopf equations which
define the optimum Wiener filter, and one obtains a simple
recursive scheme of the form [3]

$$ (\text{new tap weights}) = (\text{old tap weights}) + (\text{step size}) \times (\text{new information}) \qquad (1) $$

where the new information consists of the product of the
filter input vector and the error signal, i.e. the
difference between the desired filter output and the actual
output of the filter. Essentially one can express the LMS
algorithm (and many other adaptive algorithms) by means of
three expressions [1]:

Filter output:      $ y(t) = \mathbf{w}^T(t)\,\mathbf{x}(t) $      (2)
Adaption error:     $ e(t) = d(t) - y(t) $      (3)
Tap weight update:  $ \mathbf{w}(t+1) = \mathbf{w}(t) + \mu\,\mathbf{x}(t)\,e(t) $      (4)

where x is the filter input vector, w^T the transposed tap
weight vector, d represents the desired filter output and μ
is the learning rate (step size parameter).

Several modifications have been made to improve the
LMS algorithm. The most important one resulted in the
normalised LMS (NLMS [4]), which normalises the adaption
error e using instantaneous estimates of the input vector
power ||x||^2. Doing so largely improves the algorithm's
stability characteristics and allows faster convergence. The
LMS algorithm is computationally simple but its convergence
rate is slow and highly dependent on the properties of the
input signal, more specifically on the eigenvalue ratio of
the autocorrelation matrix. When many elements of the input
signal are unknown, for example the channel in a mobile
communications system, it is difficult to choose μ. The
algorithm is numerically stable, but an inappropriate choice
of μ can cause instability. In high noise conditions, the
eigenvalue ratio of the autocorrelation matrix is low and
this can help with convergence. Other versions of the LMS
such as leaky, signed and quantised LMS [3] were developed
but only the NLMS achieved sufficient stability at
acceptable performance and is therefore considered as a
candidate for application in advanced multiuser detection.

To illustrate the improved performance of the
normalised LMS as compared to the standard LMS we will
present the performance of both the LMS and NLMS later in
this document.
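The three LMS expressions (2) to (4), and the NLMS normalisation by the input power, can be sketched in a few lines. The 2-tap target system, the periodic input pattern and the step sizes are hypothetical values picked so that the example is deterministic; they are not taken from this document.

```python
def lms_step(w, x, d, mu):
    """One LMS iteration following expressions (2) to (4)."""
    y = sum(wi * xi for wi, xi in zip(w, x))              # (2) filter output
    e = d - y                                             # (3) adaption error
    return [wi + mu * xi * e for wi, xi in zip(w, x)], e  # (4) weight update

def nlms_step(w, x, d, mu, eps=1e-8):
    """NLMS: the same update with the step divided by the power ||x||^2."""
    power = sum(xi * xi for xi in x) + eps
    return lms_step(w, x, d, mu / power)

def identify(step, mu, iterations=350):
    """Identify a hypothetical 2-tap system from a periodic +/-1 input."""
    target = [0.5, -0.3]
    pattern = [1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
    w, x = [0.0, 0.0], [0.0, 0.0]
    for n in range(iterations):
        x = [pattern[n % len(pattern)]] + x[:-1]  # shift in one new sample
        d = sum(ti * xi for ti, xi in zip(target, x))
        w, _ = step(w, x, d, mu)
    return w

w_lms = identify(lms_step, mu=0.1)
w_nlms = identify(nlms_step, mu=0.5)
```

Both runs end close to the target weights in this noiseless setting. With noise or a poorly conditioned input the choice of μ matters much more, as the text explains.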

Recursive least square algorithm

The most popular least-square (LS) technique is
probably the recursive least squares (RLS) algorithm. The
RLS algorithm is computationally much more complex, but has
much faster convergence than the LMS algorithm. It has two
parameters, the forgetting factor λ and the initialisation
factor δ for the diagonal matrix P(0), see the equations
below. The forgetting factor is set according to the rate of
change of the autocorrelation of the input signal. The
diagonal term has little effect on the algorithm once
converged, but does affect the size of internal variables
within the algorithm during initial convergence. The RLS
algorithm is usually considered converged within a number of
iterations equal to twice the filter length, which is
generally much faster than the LMS algorithm. The RLS
algorithm can become numerically unstable when the
autocorrelation matrix of the input signal is close to being
a singular matrix.

We can summarise the standard RLS algorithm as follows
[1]:

INITIALISATION

$$ \mathbf{P}(0) = \delta^{-1}\mathbf{I} $$

$$ \mathbf{w}(0) = \mathbf{0} $$

SUBSEQUENT ITERATIONS

$$ \mathbf{g}(t) = \frac{\mathbf{P}(t-1)\,\mathbf{x}(t)}{\lambda + \mathbf{x}^T(t)\,\mathbf{P}(t-1)\,\mathbf{x}(t)} \qquad (5) $$

$$ e(t) = d(t) - \mathbf{w}^T(t-1)\,\mathbf{x}(t) \qquad (6) $$

$$ \mathbf{w}(t) = \mathbf{w}(t-1) + \mathbf{g}(t)\,e(t) \qquad (7) $$

$$ \mathbf{P}(t) = \frac{1}{\lambda}\left(\mathbf{P}(t-1) - \mathbf{g}(t)\,\mathbf{x}^T(t)\,\mathbf{P}(t-1)\right) \qquad (8) $$

The vector g(t) is known as the gain vector of the
algorithm, and P(t) represents the inverse of the correlation
matrix

$$ \mathbf{R}(t) = \sum_{i=1}^{t} \lambda^{t-i}\,\mathbf{x}(i)\,\mathbf{x}^T(i) $$

which can be computed in a recursive manner by
means of the matrix inversion lemma. The variable e denotes
the a-priori error of the filter estimation and the filter
weights are again represented by the vector w.
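The RLS recursion (5) to (8), with P(0) set to the identity divided by δ and w(0) set to zero, can be sketched in plain Python as follows. The 2-tap target system, the input pattern and the values of δ and λ are hypothetical and serve only to make the example deterministic.

```python
def rls_init(n, delta):
    """Return w(0) = 0 and P(0) = (1/delta) * I."""
    P = [[(1.0 / delta if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return [0.0] * n, P

def rls_step(w, P, x, d, lam):
    """One RLS iteration following equations (5) to (8)."""
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    xP = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    g = [v / denom for v in Px]                          # (5) gain vector
    e = d - sum(wi * xi for wi, xi in zip(w, x))         # (6) a-priori error
    w = [wi + gi * e for wi, gi in zip(w, g)]            # (7) weight update
    P = [[(P[i][j] - g[i] * xP[j]) / lam for j in range(n)]
         for i in range(n)]                              # (8) update of P
    return w, P, e

target = [0.5, -0.3]
pattern = [1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
w, P = rls_init(2, delta=0.01)
x = [0.0, 0.0]
for n in range(20):
    x = [pattern[n % len(pattern)]] + x[:-1]
    d = sum(ti * xi for ti, xi in zip(target, x))
    w, P, e = rls_step(w, P, x, d, lam=0.98)
```

Each iteration costs on the order of N² operations because of the update of P, which is the complexity penalty referred to in the text.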



Fast a-posteriori error sequential technique (FAEST)

The fast a-posteriori error sequential technique
(FAEST) was first reported by Carayannis et al. [5] in
the context of least-square (LS) filtering. Compared to
standard LS algorithms, the FAEST uses a different approach
for calculating the Kalman gain vector, based on the
a-posteriori error formulation rather than the a-priori
error formulation used in fast Kalman algorithms [6].
Assuming the shift invariance property of the input signal
and introducing a slightly modified version of the Kalman
gain, the FAEST algorithm manages to perform a direct
updating of the Kalman gain without invoking matrix-vector
multiplications.

The Kalman gain update according to the FAEST
algorithm is summarised in Table 1.

Table 1: The FAEST algorithm.

Time update of gain vector


Multiplications 


Divisions 


evW = i(t)-a: v (f-l)x,v(*-l) 


N 


- 


J ( t ) - f<d0 




1 


a'„{t) = \ a ' v (t-l) + e'„(t)si.(i) 


2 


- 




2 


1 


M 0 ,' v (t) 1 
wat+i 10 - _ WA , ( t _ i) _ " 7«*(t-l) _ a/V (t - 1) 


N+l 


1 


a,v (t) = o,v (t - 1) - sif (t) w,v (* - 1) 


N 


1 


evW = -Aa» v (t-l)w^L 1 (f) | 2 


_ 




1 


1 


7;V (t) = 0 (*) 7AT+1 (*) ! 1 


- 




1 


aV(«) = Aa , v(t-l) + e , v(*)«*v(t) 1 2 




" w w (t) _ frt _ 3*1' " -b*r (t - 1) 


/V 




by («) = b.v (< - 1) - e? v (t) Wtf (l) | jV 




Total number of multiplications and divisions: 5N + 11 multiplications, 5 divisions



It is well known that the FAEST algorithm suffers from
severe stability problems and we will therefore not describe
this algorithm in more detail but focus on its stabilised
version, the SFAEST. For more information about FAEST please
refer to [5].

Stabilised FAEST

The stabilised version of the FAEST algorithm was
derived in [7] and is presented in Table 3. For easier
reading of the equations that make up the stabilised FAEST
(SFAEST) algorithm, we first list all the variables that
occur and briefly explain their meaning, see Table 2.



WO 00/51260 



PCT/GBOO/00649 




Table 2: Variable definitions of the SFAEST algorithm. 



Variable Name


Definition


x(t)


Filter input at time t


y(t)




x_N(t)


Input vector [x(t), ..., x(t - N + 1)]


g_N(t)


Dual Kalman gain vector


a_N(t)


Forward predictor coefficients


b_N(t)


Backward predictor coefficients


h_N(t)


Filter coefficients


e_N(t)


A-priori filter error






ev(*).«v(0- 


Forward predictor error and its corrected version 




Backward predictor error and its corrected version 


α^f_N(t)


Forward prediction error power


α^b_N(t)


Backward prediction error power


γ_N(t), γ_{N+1}(t)


0 < γ_N(t) < 1


0 ( v(t) 




&v(*) 


Difference of errors and the corrected error difference 


λ


Exponential forgetting factor, 0 < λ < 1



Table 3 shows the successive steps of the SFAEST 
algorithm and also evaluates the complexity in terms of 
multiplications and divisions required. 




Table 3: Tasks of SFAEST in order of computations 
and number of required MUL/DIV operations 



Tasks of SFAEST 


MUL 


DIV 


Available at time t: 

g,v(t - 1), a,v(t - 1), b jV (« - 1), h,v(* - 1), x,v(t - 1) 






New Information: 

*(*), y(t) 






Computation of the residuals and corrections: 
A-priori forward/ backward errors (residuals): 
1) eUt) = x(t) - «y(f - l) r x,y(t - 1) 
21 e\ r (t) ss xlt - N) - bM(t - l) T x,v(0 


N 
N 


- 


Normalisation parameters: 

3) 'iNi-i(t)-—, ia/v "*" i rw(r 1) 

4) MO = 1 + 7tf+i(<)«W*)(lff(< " 1) + 4^T) a 'V(' " D' V ) 

5) 7*ffl - ^# a°d 7*(*) = 1-^0) 

6) Jk iV (* - 1) = A-' v 7i v(t - l)a#(t - 1) 


4 

3 

2 


1 

1 
1 


Difference of errors: 

7) ftv(f) = e* v (t) + M< - l)e<r«) + Aa? v (* - l)g#(t - 1) 


3 




Difference of errors of corrected filters 

8) £,v(*) - i+/,(i-» w (t))+p*=(,(t-X)(l-7*r(«-l)) 


4 


1 


Corrected a-priori forward/ backward errors: 

9) ef v (r) = ej(*) - (1 - 7at(* - 1))M* - l)p?*r(*) 

10) af v (r) = Aaf v (i - 1) + 7 yv(r - l)(*jU0) 3 

11) = ef v (t) - (1 - 7* (*))rf,v(*) 

12) atv(«) = Aa 4 v (r - 1) 4- 7.v(r)(ef v (t)) 2 


3 

3 
2 

3 


_ 
- 


Time update of the duai Kalman gain g ; v(i) : 
Extended Kalman gain vector: 

ill tt\ f 0 *' v(t) f 1 
13) g,v+i(*)- giV (i_i) XaUt-i) -a, v (t-l) 


N + l 


1 


Forward filter: 

141 a,vf<) = a/vft - 1) - 7v(« - D(e v(0 + k*r(t - l)p?v(*))giv(f - 1) 


N + 4 




Dual Kalman gain: 

la) q =g,v+i(t) -g.vitl*) i 


N 




Backward filter: 

16) b,v(t) = bjv(t - 1) - A :V (r)(e»v(0 + />{,vW)g.vW 


N + Z 




Time update of the filter hx{t): 

18) e,v(t) = V(t) - h^(t)x,v(r) 

19) £tf(t)=7/v(t)*<v(t) 

20) h.v(t) = h,v(t - 1) - «Ar(«)giv(*) 


N 
1 

N 




Total number of multiplications and divisions: 8N + 36

Initialisation, stabilisation and optimisation issues
related to the SFAEST

The value of the forgetting factor λ largely depends
on the rate at which the input signal changes and defines
the memory length of the algorithm. The eigenvalue spread of
the weighted input covariance matrix given by

$$ \mathbf{R}_N(t) = \sum_{\tau=1}^{t} \lambda^{t-\tau}\,\mathbf{x}_N(\tau)\,\mathbf{x}_N^T(\tau) \qquad (9) $$

plays an important part in determining λ. A suitable value
could for example be λ = 0.98 but the type of the input
signal needs to be considered and it is most likely that
changing between static and dynamic users and static and
dynamic channels will require different values for λ.
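To give a feel for what the example value implies, a common rule of thumb (stated here as an assumption; it does not come from this document) is that an exponentially weighted window with forgetting factor λ has an effective memory of about 1/(1 - λ) samples:

$$ \frac{1}{1-\lambda} = \frac{1}{1-0.98} = 50\ \text{samples} $$

With chip rate training and the 16 chip spreading codes used in the simulations, 50 samples correspond to roughly three data bits (50/16 ≈ 3.1). A faster changing channel would therefore call for a smaller λ and a slowly changing one for a value closer to 1.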

The choice for the variable p is not straightforward
and no direct rule for calculating an appropriate value
exists. In [7] it is mentioned that an initialisation can be
done by means of an estimate such as

$$ p \approx \frac{p_0}{1-\lambda} \quad \text{with } p_0 \approx 0.05. \qquad (10) $$

This could however not be confirmed by simu' ations carried 
out in the course of this work. Usually a value of p = 1 has 
been chosen but an adaptive estimation of p during operation 
of the algorithm might prove advantageous. The most crucial 
20 parameters were found to be the forward/backward error 
powers. Initialisation has been done by means of the 
following rules: 

afv(0)=/^ v (11) 
*tv(0)=/i (12) 

As can be seen both error powers depend on the value /x and 
hence this parameter needs to be initialised with care. A 
25 typical, stable range for p. was found to be between 10 < fx 
< 100. From a stability point of view it is advantageous to 
monitor the evolution of the variable y N (t) . To prevent 
divergence of the algorithm, this variable should be 
restricted to values between 0 and 1. A recommended rule [7] 
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to assure convergence is to reinitialise the algorithm when 
the condition 



becomes true, where v is a small constant. This precaution 
will also take care of the case where the gain tends towards 
5 zero, hence y N -> 0 . 

Fast Newton transversal filter 

The Fast Newton algorithm can
simplify the calculation of any of the above adaptive filter
algorithms if the input signal can be modelled as an
autoregressive process of order lower than that assumed by the
above filters. Fast Newton transversal filters originate
from the area of speech enhancement and echo cancellation.
Their main feature is a fast calculation of the gain vector
as required in many LS adaptive algorithms.

The stabilised fast Newton transversal filter (SFNTF)
is essentially a computational accelerator for any least
squares (LS) algorithm. Operating as a "higher-level"
adaptive predictor it can use any LS algorithm as a
subroutine within its own algorithm. However, the order of
this LS filter can be chosen to be smaller than the actual
filter order of the SFNTF algorithm. A sophisticated
predictor part then extrapolates the remaining filter
coefficients to gain the complete set of filter coefficients
as required according to the definition of the SFNTF length.
It is this feature that should make the SFNTF a potentially
attractive adaptive algorithm for many applications and the
usefulness of SFNTFs for advanced MUD receivers shall be
examined in the future. As far as this report is concerned,
we will however restrict ourselves to the description of the
algorithm only.



(13) 



To discuss the details of the SFNTF we will again list
all variables as present in the algorithm with some words of
explanation about the meaning of the variables. Table 4
lists these definitions.

Table 4: Variable definitions of the SFNTF algorithm.

Variable name            Definition
x(t)                     Filter input at time t
y(t)                     Filter output, y(t) = U vt < U
x_N(t)                   Input vector [x(t), ..., x(t-N+1)]
g_N(t)                   Dual Kalman gain vector of order N
q_N(t), q_{N+1}(t)       Temporary vectors to compute g_N(t) in versions 2 and 3
                         Temporary vectors to compute g_N(t)
a_N(t)                   Forward predictor coefficients
b_N(t)                   Backward predictor coefficients
h_N(t)                   Filter coefficients
R_N                      Input sample covariance matrix
r_N                      Vector that builds R_N, see end of table
e_N(t)                   A-priori filter error
ε_N(t)                   A-posteriori filter error
                         Forward predictor error and its corrected version
e^b_N(t), r_N(t)         Backward predictor error and its corrected version
α^f_N(t)                 Forward prediction error power
α^b_N(t)                 Backward prediction error power
γ_N(t), γ_{N+1}(t)       0 < γ_N(t) < 1
0,v(*)
&r(<U.v(«)               Difference of errors and the corrected error difference
λ                        Exponential forgetting factor, 0 < λ < 1



We can now describe the SFNTF with all its equations.
For easier reading of the following equations, we first
define two new vectors s and u:



,pw( , "AoJ( t -i)l-M(«-l) J 



-bp(t-i) 
i 



t° = t - N + P 



(14) 
(15) 



Comparing these definitions with Equation 4, we notice
the similarity with the updating part μxe. While λ
represents the forgetting factor, similar to μ, the second
factors of Equations 14 and 15 are normalised error signals
and the third factors, containing the vectors a and b, are
similar to the input vector x of Equation 4. Note however
that this is only a basic attempt to explain the nature of
the vectors s and u and not a completely valid comparison,
in particular with respect to the third factor. The vectors
a and b represent the forward and backward predictor
coefficients of the FNTF, respectively, and are computed by
means of the sample covariance matrix R_P by



a/>(0 = V(f-l) 



(16) 



and 



M0 = Rp (0 



x'(t) 



m x l (t - P -f 1) 



(IT) 



Their corresponding predictor error powers can then be
defined as



af(«)»»°(«)-[^(0 ... z F (t))*p(t) 



and 



a b p(t) = * 0 (t-P)-[*i(t) ... x p (t-P+l) }b P (t) 



(18) 



(19) 



As we notice from the definitions above, the SFNTF
operates two separate predictor branches and by enabling or
disabling those branches we can create three different
versions of the algorithm:

Version 1: Using forward and backward predictors (not 
recommended) 



Available values at time t: 
0 From SFAEST algorithm: 



g,v(* - 1) and 7/V,/>(« - 1) 
sp +1 (0,ei(<),«P+i(* 0 ) and e b p (t°) 



(20) 

(21) 
(22)

Version 2: Using only forward predictors



Available at time t: q. v (t - 1), gv{t - 1) 

From SFAEST algorithm: s P+l {t), e f P (t), sp^it 0 ), e^t 0 ), S p{t°)np{i°) 



q/V+i(*) = 



Suit) = 



q^+iO) 



Ojv-p . 



',3- 



W(0 = PAf(t - 1) + sj, +l (t)e£(t) ~ 

7/v ( />(0 = 7p(* 0 ) + <7/v(0 



(23) 
(24) 

(25) 

(26) 
(27) 



Version 3 : Using only backward predictors 



'5 Available at time t: qA ,(* _ i),gx(t - 1) 

From SFAEST algorithm: u P+l (t), ej,(t), u P+ i(<°). sM'V/H* 0 ) 



q,v+i(<) = 



' o • 


r 


. q*(< - 1) . 


+ 



On-p 



0 



= q.v+i(<) - 



Rv(t) = 



SP(t) 
0/V-p 



0/v-p 
up +1 (t°) 

- q*r(t) 



= <7,v(t - 1) + uj«(t)£(t) - uf «(t°)4(«°) 
7<V.p(0 = 7P(*)+&v(0 

Time update of the filter h v (t) : 

Available at time t: h*/(t - l),x,v(* - 1) 
New information: *(*)»»(*) 



(28) 

(29) 

(30) 
(31) 



e*(t) = y(*)-h$(t-l)x <v (t-l) 
7;v,p(t) 

h,v(<) = h/v(« - 1) - <iv(t)giv(0 



(32) 
(33) 
(34) 



Predictor version 1 is more likely to diverge due to
the fact that the calculation of γ is not realised as a
finite sum. Versions 2 and 3 include only a finite sum of
P + 1 values in the computation of γ.

As the algorithm of Table 5 is rather complex and
difficult to overview, we also include Table 6 which
displays the variable occurrences in the different equations
of Table 5 and states which variables are required for
updating other variables.



Note: The definitions of γ differ between FAEST and FNTF.

Table 5: Tasks of SFNTF with SFAEST in order of computation and number of required MUL/DIV operations



SFNTF with SFAEST 


MUL 


Drv 


Available at time i. 
gp(t - 1), a P (f - 1), bp(t - 1), h iVft -i, xp(t - 1), 

7P(«-I),«^(t-D.«*p(t-1) 






New Information: 
xW, »(<) 






Computation of the residuals and corrections: 
A-priori forward/ backward errors (residuals) 

1) eUt) = x{t)-&p(t-l) H x P {t-l) 

2) f p (t)**x(t-N)-\>p(t-l) H xp(t) 


P 
P 


- 


Normalisation parameters 

3) 7^+1 (*)- "A«i(t-l)+,,(»-t)(ei(0) J ' 7,,( ' 

4) MO = i + 7P+i W«W*)(rf(* - 1) + sjjfe*^' ~ 

5) 7*M - ^ »*» 7^(0 - TZijteoI 

6) Jk P (t-l) = A- f, 7p(i-l)a^(r-D 


4 
3 

2 


1 
1 
1 


Difference of errors: 

7) (0 = e%(t) + M* - l)ep(t) + Aa* p (* - l)g£(t - 1) 


3 




Difference of errors of corrected filters 


4 


1 


Corrected a-priori forward/backward errors 

9) e f P (t) = e£(t) - (1 - Mt - 1))M* - l)p| P (<) 

10) a£(<) = Aa p (r - 1) + 7? (t - l)(? P (t)) 2 

11) e* P (r) = e P (t) - (1 - 7J>(t))/>?p(0 

12) a 6 pft) = Aa* P (r - 1) + 7p(t)(*p(*)) 2 


3 
6 
2 
3 


- 
- 


Time update 
Extended Kaln 

13) gp+u = 


of the duaJ Kalman gain gp(t): 
lan gain vector: 

f 0 1 54(0 f 1 1 
gp(t-l) Aori(t-l) _ -ap(t-l) . 


P-rl 


1 


Forward filter: 

14) ap(t) = ap(t - l)-'/p(t - l)(e£(<) + M* - l)pZp{t)) g p(t - 1) 


P + 4 


- 


Dual Kalman g 


;ain: 

= gP+l,t ~ Sp+u i 


P 




Backward filter: 

16) bp(t) = bp(t - 1) - 7p(t)(e 6 p(r) + pl P (t))gp(t) 


i> + 3 


- 


Computation of dual Kalman gain for SFNTF: 
17) see description of Version 1), 2) and 3) 


2 




Time update of the SFAEST filter h/»(t): (not required for SFNTF) 

18a) ep(t) =y(t)-h£(t)xp(r) 

19a) ep(t) = ip{t)e P {t) 

20a) hp(r) = hp(t - 1) - ep(r)gp(t) 


P 
1 

P 




Time update of the SFNTF filter h^ >t : 
18b) tff{t) = y(t) - hft^x^t-i 
19b) tff{t) = e,v(t)/7/v(t) 1 
20b) h N .t = h/v.t-i - f/v(0g/v,t 


JV 
iV 


1 


Total number of multiplications and divisions: 8P + 2N + 39
APPENDIX II
Multiuser detectors for CDMA 

THE MMSE RECEIVER 

Consider the direct sequence spread spectrum system shown
in Fig. 6.



In it we have M users (m = 1 to M), each transmitting a
PN code of length N chips (n = 1 to N). We assume that all the
users are both data bit and chip synchronous and that the data
modulation D_m and the codes take the values ±1. The signal
at the input to the receiver at chip n is therefore:-

y(n) = X + W(n) 



(1) 



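W(n) is the noise sample at chip n. Wiener filter theory
[1], [15] states that the optimal set of weights for an FIR
filter h_opt is given by:-

\[ h_{opt} = \Phi_{yy}^{-1}\,\Phi_{xy} \qquad \{2\} \]

Φ_yy is the autocorrelation matrix of the input signal and
for a stationary input:-

\[ \Phi_{yy} = E\begin{bmatrix}
y^2(n) & y(n)y(n-1) & \cdots & y(n)y(n-N) \\
y(n)y(n-1) & y^2(n) & \cdots & y(n)y(n-N+1) \\
y(n)y(n-2) & y(n)y(n-1) & \cdots & y(n)y(n-N+2) \\
\vdots & \vdots & \ddots & \vdots \\
y(n)y(n-N) & y(n)y(n-N+1) & \cdots & y^2(n)
\end{bmatrix} \qquad \{3\} \]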



where E[x] denotes the expected value of x. We are interested
in the FIR filter when it contains no data transitions, ie
when it only has one data bit from each user in it. Assuming
that the data modulation on each code is independent, ie
E(D_A D_B) = 0, A ≠ B, by direct substitution of equation {1} into
equation {3} or analogy with the spatial radar case [16], [17]
it can be shown that



The matrix Q has dimension N×M and has the codes as its
columns, ie:-

\[ Q = \begin{bmatrix}
c_{11} & c_{21} & c_{31} & \cdots & c_{M1} \\
c_{12} & c_{22} & c_{32} & \cdots & c_{M2} \\
c_{13} & c_{23} & c_{33} & \cdots & c_{M3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c_{1N} & c_{2N} & c_{3N} & \cdots & c_{MN}
\end{bmatrix} \]

Q^T denotes the transpose of matrix Q. The matrix A has
dimensions M×M and has P_m at position m along its leading
diagonal and is zero elsewhere. P_m is the power at the
receiver of user m. For the rest of the section we shall
assume P_m = 1 for all m, ie perfect power control with the
power of each user normalised to 1. The σ²I term is the noise
term, assuming the noise power is σ² and the noise is
uncorrelated.
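
As a rough numerical illustration of the quantities just described, the following Python sketch builds an autocorrelation matrix from a code set and solves for the Wiener weights. It rests on assumptions that should be checked against the original equations: the autocorrelation is taken to be QQ^T + σ²I (unit user powers, so A is the identity), and the cross-correlation vector is taken to be the desired user's code, as the text goes on to derive in equation {6}. The function names and the small Gaussian elimination routine are illustrative only.

```python
def autocorrelation(codes, noise_power):
    """Assumed form Q Q^T + sigma^2 I, with the codes as columns of Q."""
    n = len(codes[0])
    phi = [[0.0] * n for _ in range(n)]
    for code in codes:
        for i in range(n):
            for j in range(n):
                phi[i][j] += code[i] * code[j]
    for i in range(n):
        phi[i][i] += noise_power
    return phi


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def wiener_weights(codes, noise_power, desired=0):
    """Weights h = Phi_yy^-1 Phi_xy, with Phi_xy the desired code."""
    phi = autocorrelation(codes, noise_power)
    return solve(phi, [float(c) for c in codes[desired]])
```

For orthogonal codes the result is simply a scaled copy of the desired code, which is the matched filter; with correlated codes the weights depart from the code in order to suppress the MAI.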



The vector Φ_xy is given by:-

\[ \Phi_{xy} = E\left[ x(n) \begin{bmatrix} y(n) \\ y(n-1) \\ y(n-2) \\ \vdots \end{bmatrix} \right] \qquad \{5\} \]



x(n) is the desired response. We are only interested in
the response when the FIR filter contains the chips from one
data bit, ie when n = 0, and at that point the desired response
is the data bit for the desired code. Assuming the desired
code is m = 1 then x(n) = D_1. Substituting this and equation
{1} into equation {5} and assuming that the data symbols are
uncorrelated yields:-

\[ \Phi_{xy} = \begin{bmatrix} c_{11} \\ c_{12} \\ \vdots \\ c_{1N} \end{bmatrix} \qquad \{6\} \]



Thus the vector Φ_xy is the desired code. From equations
{1}, {3} and {6}, and under the assumptions that we have
stated, the optimum mean squared error performance is only
dependent on the code set chosen. The minimum mean squared
error at the FIR filter output is given by:-

We can define the output signal to noise ratio for the
filter as:-

Assuming that the channel bandwidth in Hz is equal to the
chip rate in chips per second then we can use the energy per
bit E_b divided by the noise power spectral density N_0 as a
processing gain independent measure of the signal to noise
ratio at the input to our FIR filter. In our case:-

Figs. 7, 8, 9 and 10 show the theoretical and simulated
performance of the Wiener filter calculated as above for a set
of 7 and 31 chip Gold codes. Fig. 7 shows the theoretical and
simulated performance of the 7 chip Gold codes in terms of the
output signal to noise ratio from the Wiener filter. Fig. 7
shows that the performance of the optimal filter with 4 users
is practically the same as the performance of a matched filter
with no MAI. Looking at the data, at 10 dB output signal to
noise ratio, the difference between the 4 user curve and the
matched filter curve is around 0.3 dB. From Fig. 7, to have
7 users in a system with a processing gain of 7 we must have
an increase of around 1.8 dB in E_b/N_0 to achieve the same
performance as a matched filter with no MAI. These figures are
again measured with 10 dB as the required output signal to
noise ratio.

Fig. 8 shows theoretical and simulated performance of the
31 chip Gold codes in terms of the output signal to noise
ratio from the Wiener filter. With 16 users, we require around
a 0.1 dB increase in E_b/N_0 over the matched filter with no
MAI. This is better than the 4 user, 7 chip case, partly
because 4/7 is greater than 16/31. When there are 31 users in
a 31 chip system, the required increase in E_b/N_0 is 1 dB over
the matched filter with no MAI. It would appear that the 31
chip system enjoys some inherent advantage over the 7 chip
system, probably due to the greater degree of statistical
orthogonality between the 31 chip codes when compared with the
7 chip case. However, other factors such as the choice of the
desired user or the choice of the code sets cannot be ruled
out.

Figs. 9 and 10 show the simulated bit error rate (BER)
performance of the 7 and 31 chip codes. These graphs show
similar results to Figs. 7 and 8 in terms of the increase in
E_b/N_0 required to accommodate more users. A good approximation
to Figs. 9 and 10 can be derived from Figs. 7 and 8 assuming
that because of the central limit theorem the noise output
from the Wiener filters is Gaussian.
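
The Gaussian approximation mentioned above can be sketched as follows. This is an illustration, not the calculation used for the figures: it assumes antipodal (±1) data and that the output signal to noise ratio is defined as signal power over noise power at the filter output, in which case the BER is 0.5·erfc(sqrt(SNR/2)). The function names are hypothetical.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def ber_from_output_snr(snr_db):
    """Gaussian-noise BER for antipodal data at a given output SNR.

    Assumes SNR = signal power / noise power at the filter output.
    """
    snr = db_to_linear(snr_db)
    return 0.5 * math.erfc(math.sqrt(snr / 2.0))
```

Under these assumptions, an output signal to noise ratio of 10 dB corresponds to a BER somewhat below 10^-3.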

Adaptive filter receivers 

In most cases the DS-CDMA channel is non-stationary and
as we have already stated the MAI will vary with the birth and
death of signals. Therefore the FIR filter used in our
receiver has to be a time varying approximation to the Wiener
filter calculated using an adaptive algorithm. We will
consider the properties of the two most popular adaptive
algorithms, least mean squares (LMS) and recursive least
squares (RLS) [1]. We shall use the 7 chip Gold code case as
our example, as the 31 chip case would involve considerable
extra computation. Fig. 11 shows the convergence properties
of the LMS algorithm with two different values of the adaption
parameter μ. The graphs show the feedback error ensemble
summed over 100 independent trials. With μ = 0.007 the
algorithm converges to 20% of its initial value within
approximately 20 data bits, but with μ = 0.0007 convergence
takes around 100 data bits. In a DS-CDMA cellular system,
where the channel varies relatively quickly with respect to
the data rate of a single user, this slow convergence will not
be acceptable. Fig. 12 shows the BER performance of the LMS
algorithm after convergence with the same values of μ as for
Fig. 11. It shows that the smaller value of μ produces a good
approximation to the Wiener filter, but the performance of the



WO 00/51260 



PCT/GB00/00649 




LMS algorithm with the larger value of μ is around 2 dB worse
in terms of the required signal to noise ratio for a bit error
rate of 10^-3. Thus it is difficult to find a value of μ which
will provide a fast enough convergence without introducing a
large residual error into the performance of the filter, even
for the 7 chip case. The convergence time will be greater if
the codes are longer. If we consider a DS-CDMA cellular system
with imperfect power control and a multipath channel, then the
eigenvalue spread of the input signal will be increased. This
will also have an adverse effect on the convergence time of
the LMS algorithm. More evidence of this is contained in [18].
Thus the LMS algorithm is unsuitable for this application.
Also shown in Fig. 11 is the convergence of an RLS algorithm
with λ = 1 and the diagonal elements of the autocorrelation
matrix initially set to 0.001. This shows typical behaviour
for an RLS algorithm, rapid convergence by 2N. The RLS curve
in Fig. 12 shows that the RLS algorithm does not add any
significant residual error to the Wiener filter solution. Thus
the RLS algorithm is potentially more suited to a DS-CDMA
cellular system.
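
The difference in convergence behaviour can be reproduced on a toy example. The sketch below is not the simulation behind Figs. 11 and 12: it uses two arbitrary 7 chip ±1 codes (not Gold codes), a noise-free channel and a fixed training pattern, all of which are assumptions made so that the example is deterministic. The LMS update is the standard h ← h + μ e y, and the RLS recursion uses λ = 1 with the inverse correlation matrix started at I/0.001, mirroring the initialisation quoted above.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lms(inputs, desired, mu):
    """Standard LMS; returns final weights and a-priori errors."""
    h = [0.0] * len(inputs[0])
    errors = []
    for y, d in zip(inputs, desired):
        e = d - dot(h, y)
        errors.append(e)
        h = [hi + mu * e * yi for hi, yi in zip(h, y)]
    return h, errors


def rls(inputs, desired, lam=1.0, delta=0.001):
    """Standard RLS with inverse correlation matrix P(0) = I / delta."""
    n = len(inputs[0])
    h = [0.0] * n
    p = [[1.0 / delta if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    errors = []
    for y, d in zip(inputs, desired):
        py = [dot(row, y) for row in p]
        denom = lam + dot(y, py)
        k = [v / denom for v in py]          # gain vector
        e = d - dot(h, y)                    # a-priori error
        errors.append(e)
        h = [hi + ki * e for hi, ki in zip(h, k)]
        p = [[(p[i][j] - k[i] * py[j]) / lam for j in range(n)]
             for i in range(n)]
    return h, errors


def toy_two_user_data(n_bits=32):
    """Hypothetical noise-free training data for two synchronous users."""
    c0 = [1, 1, 1, -1, -1, 1, -1]
    c1 = [1, -1, 1, 1, -1, -1, 1]
    bits0 = ([1, 1, -1, -1] * n_bits)[:n_bits]
    bits1 = ([1, -1, 1, -1] * n_bits)[:n_bits]
    inputs = [[b0 * a + b1 * b for a, b in zip(c0, c1)]
              for b0, b1 in zip(bits0, bits1)]
    return inputs, bits0
```

Running both algorithms on toy_two_user_data() with μ = 0.007 shows the RLS a-priori error collapsing within the first few bits while the LMS error is still a sizeable fraction of its initial value after 32 bits.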

Discussion 

In this section we shall look at the assumptions made in 
the derivation of the filters and look at applying this type 
of receiver to a cellular system. 

We have assumed that all the channels are data bit and
chip synchronous. If the channels are not chip synchronous,
provided the receiver samples the desired channel
synchronously, any change is likely to be beneficial as the
effective MAI will be reduced. If the channels are not data
bit synchronous however, the effect is greatly detrimental.
With transitions of the interfering channels occurring in the
middle of the desired channel data bits, each interfering
channel requires two eigenvectors in the Q matrix. Thus
performance is reduced and the system breaks down when the
number of users exceeds 0.5N instead of N for the data bit
synchronous case. An interesting case is when the data bits
are almost synchronous, for example as synchronous as the
transmission delays in the system will allow. In this case,
it may be possible to ensure that any transition occurs a
small number of chips (compared with N) from the beginning or
end of the desired user's data bit. In this case the
detrimental effect may not be so great, although this
hypothesis remains unproven. We have also assumed perfect
power control. There is some evidence that this type of
adaptive filter structure is tolerant to variations in the
power of the received codes [19]. The last assumption we made
is that the data bits are uncorrelated. Provided that the
signals are independently generated, idle channels are
suppressed and that care is taken over the content and timing
of control information, this should be a reasonable
assumption.

If we are to apply this receiver structure to a cellular
DS-CDMA system we need to take into account the non-stationary
multipath channel and the birth and death of users. To take
into account the multipath channel, we can use the technique
in [20], replacing the impulse response of the spreading
process with B^, where is the convolution of the
impulse response of the channel with the impulse response of
the spreading process and repeat the analysis. This paper
already describes the theoretical optimal with multipath
evaluated for all users of the system simultaneously. The
non-stationary elements of the channel can be taken into
account by blocks of training data with or without decision
feedback in between. The birth and death of signals problem
will be greatly alleviated if all users are constrained to
switch on and off only immediately preceding a training
block. The convergence and tracking properties of the adaptive
algorithm and the speed at which the channel varies will determine
the length and frequency of the training blocks. However, by
employing the RLS algorithm (or the covariance form of the
least squares algorithm on training blocks [1]), the ratio of
training data to information carrying data can be kept
reasonable.

Some desirable properties of this receiver are that it
does not need to know the desired user's or the interferers'
spreading codes, provided the training data is known, and that its
adaptive nature will allow it to reduce the effect of strong
interferers from neighbouring cells and local narrow band
interference.

Conclusions 

We have shown that an FIR filter can be used to separate
a desired signal from MAI in a DS-CDMA system with only a
small degradation in Gaussian noise performance. These filters
can be approximated by trained adaptive filters. These
sub-optimal filters may make practical receivers for a DS-CDMA
cellular system.

THE DECORRELATING RECEIVER 

The decorrelating detector is similar to the MMSE
detector except that the noise term is not taken into account,
ie the second term in equation {4} above is neglected in the
formula for Φ_yy.

RADIAL BASIS FUNCTION RECEIVERS 

Introduction: In many DS-CDMA communications systems
there are three sources of distortion when the signal arrives
at the receiver: structured multiple access interference (MAI)
from other users in the system, Gaussian noise which can often
be extended to include unstructured interference, and time
dispersion due to multipath propagation. The simplest DS-CDMA
receiver structures are based on matched filters for a
non-dispersive channel and RAKE receivers for multipath
channels. The MAI performance of matched filter/RAKE receivers
can be enhanced by applying cancellation at the expense of
increased receiver complexity [12]. Many proposed receiver
structures are based on linear equaliser structures. Examples
include decorrelating receivers which are based on the
zero-forcing equaliser and those based on the MMSE equaliser
[21]. We shall compare the above receiver structures with the
RBF receiver. This nonlinear receiver minimises the
probability of error when deciding on a data bit. It has
already been shown that an RBF equaliser has superior
performance to a linear equaliser in multipath channels for
conventional minimum bandwidth signalling [22] and that an RBF
filter has superior performance to a MMSE filter in a
non-dispersive CDMA system [23]. We shall show that the RBF
based receivers have the ability to considerably increase the
capacity of a DS-CDMA system when compared with other receiver
structures.

AWGN channel system model: To enable us to compare the
wide variety of receivers discussed above we shall first
consider an additive white Gaussian noise channel. Our system
will consist of U independent users each transmitting a
DS-CDMA signal which is chip and bit synchronous and with all
users transmitting equal power normalised to 1. The data bit
transmitted by user u during bit time k will be denoted by
D_u(k) and the spreading code for user u by C_{u,n}. We will use n
to denote the chip number within the code which is an integer
between 0 and (N-1) where N is the spreading sequence length
(processing gain). The spreading codes used throughout are 7
chips long and randomly generated. Without loss of generality
we can assume that the desired user is user 0. The channel
model will be a simple additive white Gaussian noise (AWGN)
model with the noise time series denoted by G(kN+i). Thus
the chip rate signal arriving at the receiver, denoted Y(kN+i),
will be:-

\[ Y(kN+i) = \sum_{u=0}^{U-1} D_u(k)\,C_{u,i} + G(kN+i) \]

at the point where data bit k chip i is received. In the AWGN
case there is no need to consider i outside the range 0 ≤ i < N
as outside this time the signal will contain no useful
information relating to data bit k.

AWGN channel receiver structures: We shall consider three
main receiver structures, matched filtering, the MMSE receiver
and the RBF receiver. In all cases we shall assume that the
codes and the signal powers are known at the receiver. In the
non-dispersive case the matched filter receiver is given by
an N tap FIR filter whose co-efficients H_i are the spreading
code for the desired receiver, i.e.

\[ H_i = C_{0,i} \]

The MMSE receiver is also an N tap FIR filter, but the
co-efficients of this filter are given (in vector form) by
[24]:-



Φ_yy is the autocorrelation matrix of the cyclostationary input
signal with dimensions N×N. The other term is the cross correlation
vector. The RBF receiver is a nonlinear filter whose estimate
of the data output is given by:-



where y(k) denotes the input vector consisting of the N chip
spaced input samples at data bit time k and c_i are the
centres. The centres are the noise free input vectors for all
possible input data bit combinations. There are therefore n_c
= 2^U centres. ‖·‖ denotes the length of the enclosed vector
(Euclidean norm). σ is the standard deviation of the noise.
w_i is the value of D_0 associated with centre c_i. In our
simulations we shall assume that the centres and weights are
known at the receiver. For all three receiver cases we shall
also consider supplementing the receiver with a single stage
of parallel cancellation similar to [12].
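
To make the role of the centres and weights concrete, the following Python sketch evaluates an RBF decision for the synchronous AWGN model. The RBF equation itself is not legible in this copy, so the sketch assumes the usual Gaussian-kernel form: the decision is the sign of the sum over all centres of w_i·exp(−‖y − c_i‖² / (2σ²)). The function name and argument layout are hypothetical.

```python
import math
from itertools import product


def rbf_decision(y, codes, sigma):
    """Decide the bit of user 0 from the N chip-spaced samples y.

    codes[u][n] is the chip n of user u; every combination of user
    bits gives one noise-free centre, so there are 2**U centres.
    Assumes a Gaussian kernel with noise standard deviation sigma.
    """
    total = 0.0
    for bits in product((-1, 1), repeat=len(codes)):
        centre = [sum(b * code[i] for b, code in zip(bits, codes))
                  for i in range(len(y))]
        dist2 = sum((yi - ci) ** 2 for yi, ci in zip(y, centre))
        weight = bits[0]  # value of D_0 associated with this centre
        total += weight * math.exp(-dist2 / (2.0 * sigma ** 2))
    return 1 if total >= 0 else -1
```

The loop over all bit combinations is what makes the cost grow as 2^U, which is the complexity issue discussed later in this appendix.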

AWGN channel simulation results: The receiver structures
above were simulated using Monte-Carlo simulation. The graphs
show the log_10 of the BER averaged over all active users in
the system plotted against the number of active users. Fig.
13 shows results for E_b/N_0 = 9 dB. These figures show that the
performance of matched filtering becomes poor as the MAI (the
number of active users) increases. The number of active users
which gives an acceptable BER for a matched filter receiver
improves considerably with the addition of cancellation, as
this filter is clearly interference limited. The MMSE receiver
performs better than both the matched filter and the matched
filter with cancellation. It too improves with the addition
of cancellation, although the improvement is not as great as
adding cancellation to a matched filter. The RBF receiver is
considerably better than both the matched filter and the MMSE
receiver, with or without cancellation. However, the RBF
receiver is not improved by the addition of cancellation. This
is because in this scenario the RBF is the maximum likelihood
receiver and therefore cancellation cannot improve its
performance.

Multipath channel system model: We shall now consider a
stationary multipath channel model. The channel we shall
consider has impulse response:-

\[ M(z) = 0.3482 + 0.8704 z^{-1} + 0.3482 z^{-2} \]

All signals are assumed to pass through the same channel, as
would be the case in the downlink of a mobile radio system.
The received signal at the time data bit k, chip i is received
by the direct path therefore becomes:-

\[ Y(kN+i) = 0.3482 \sum_u D_u(k) C_{u,i} + G(kN+i) + 0.8704 \sum_u D_u(k) C_{u,i-1} + G(kN+i-1) + 0.3482 \sum_u D_u(k) C_{u,i-2} + G(kN+i-2) \]

Note that if i-x is negative then D_u(k)C_{u,i-x} is replaced by
D_u(k-1)C_{u,N+i-x} and if i or i-x is greater than N (if we extend
the receiver filter length beyond N) then D_u(k)C_{u,i-x} is
replaced by D_u(k+1)C_{u,i-N-x}. This is a direct illustration of ISI
caused by the multipath channel.

Multipath channel receiver structures: We shall base all
our new receiver structures on a new set of input signal
samples y(kN+i) where the range of i is extended to 0 ≤ i < N+2
to capture all the signal energy that originated from data bit
k. This combined with the multipath channel will result in ISI
from both the preceding and following data bit. The equivalent
of the matched filter for a multipath channel is an FIR filter
with N+2 taps where the taps are given by:

\[ H = c_0 * M \]

The * represents convolution and c_0 and M represent the
spreading code of user 0 and impulse response of the channel
respectively. The MMSE receiver coefficients are given by the
same equation as before, however, the elements of the
(N+2)×(N+2) autocorrelation matrix Φ_yy and the (N+2) cross
correlation vector must be calculated according to the
derivation of the MMSE receiver taking into account the
multipath. The RBF receiver equation is also essentially
unchanged, except now the input and centre vectors are of
length N+2 and the number of centres is now 2^{3U} as all
possible combinations of the previous, current and next data
bit must be considered. This number of centres increases
rapidly with the number of active users U and the
computational load will rapidly become impractical.
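
The index substitution rule of the multipath model, and the resulting growth in the number of RBF centres, can be sketched as follows. The example is noise-free and all names are hypothetical; it assumes chip indices run from 0 to N-1, so that an index at or beyond N belongs to the next data bit, and it requires bit k to have both a preceding and a following bit in the list.

```python
CHANNEL_TAPS = (0.3482, 0.8704, 0.3482)


def transmitted_chip(bits, code, k, j):
    """Chip j of bit k, spilling into bit k-1 or k+1 when j is out of range."""
    n = len(code)
    if j < 0:
        return bits[k - 1] * code[j + n]
    if j >= n:
        return bits[k + 1] * code[j - n]
    return bits[k] * code[j]


def received_sample(all_bits, codes, k, i, taps=CHANNEL_TAPS):
    """Noise-free received sample at bit k, chip i, for 0 <= i < N + 2."""
    total = 0.0
    for x, tap in enumerate(taps):
        for bits, code in zip(all_bits, codes):
            total += tap * transmitted_chip(bits, code, k, i - x)
    return total


def rbf_centres(users, bits_in_window=3):
    """Number of centres when previous, current and next bits all matter."""
    return 2 ** (bits_in_window * users)
```

For example, rbf_centres(5) gives 32768 centres and rbf_centres(6) gives 262144, which illustrates how quickly the load grows with the number of active users.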

Multipath channel simulation results: Graphs for the BER
performance of the receiver structures described in the
previous section are shown in Fig. 14 for E_b/N_0 = 9 dB. These
graphs show similar trends to the nondispersive case. The RAKE
receiver rapidly breaks down as the MAI increases
because the interfering users are not taken account of at all
in the RAKE receiver and are therefore treated as unstructured
noise. The MMSE receiver does considerably better than the
RAKE receiver. The RBF receiver performs better than either
the RAKE receiver or the MMSE receiver. However, it does
involve a considerable increase in computational complexity.
The RBF results are truncated at 5 users because the
calculation of 6 users would require evaluating the Euclidean
distance from 2^18 centres for every data bit sent.

Discussion and Conclusions: We have shown that an RBF
receiver has greatly improved performance over the more
conventional matched filter and MMSE based receiver
structures. This performance increase is apparent in both AWGN
and multipath channels. However the computational complexity




and the number of variable parameters in a non-stationary
environment presently make the RBF filter receiver impractical
for mobile radio applications.

NEAR OPTIMUM RECEIVERS 

Near Optimum receivers use similar methods to the Radial
Basis Function receiver, usually with simplifications to the
Radial Basis Function by dropping the exponential term and
simplifying the Euclidean difference expression. These
receivers can be implemented at the chip level or the bit
level (after preprocessing) and often involve the Viterbi
forward programming algorithm to search for the optimum centre
(path in the Viterbi algorithm).

CANCELLATION BASED RECEIVERS 

The optimal receiver for a direct sequence code division
multiple access (DS-CDMA) system is the maximum likelihood
sequence estimator [25] [14] which is effectively an RBF
receiver with infinite memory. However, this receiver's
complexity rises exponentially with the number of users and
is therefore impractical. At the other end of the complexity
scale, a matched filter is commonly used as the receiver in
a DS-CDMA system. A matched filter is the optimal receiver
only in additive white Gaussian noise (AWGN) and has very poor
performance when the level of multiple access interference
(MAI) from other users in the system is high.

Between these two extremes a range of sub-optimal
receiver structures have been proposed. The majority of these
can be split into two types. Cancellation structures are
typically based on matched filtering followed by subtraction
of the matched filter output from a delayed version of the
input signal [12] [26] [27]. Equaliser structures use a linear
or decision feedback FIR filter with the co-efficients
optimised according to various criteria such as minimum mean
square error (MMSE) or zero-forcing [18] [20] [28]. A
comprehensive review of this area and a reference list can be
found in [21].

In this section the performance of a Wiener filter
receiver is compared with that of a simple parallel canceller.
The combination of the two techniques is also examined.

System description 

The system to be considered consists of U independent
equal power users transmitting both data bit and chip
synchronously as shown in Fig. 15. The data bit transmitted
by user u at bit time k will be denoted by $D_u(k)$ and the
spreading code for user u will be denoted by $C_{u,n}$, where n
denotes the chip within the code, an integer between 0 and
(N-1), and N is the spreading sequence length (processing
gain). The spreading codes used as an example throughout this
paper are 64 chips long and randomly generated. Orthogonal
(Walsh) or semi-orthogonal (Gold) codes would give better
performance, but would require longer simulation times to give
meaningful BER results. In many communications systems the
channel will destroy the orthogonality property of orthogonal
codes anyway. Throughout this paper the chip and data values
will be normalised to ±1. Without loss of generality we can
assume that the desired user is user 0. The channel under
consideration will be a simple AWGN channel with the noise
denoted by $G(kN+n)$ having variance $\sigma^2$. Thus the chip
rate signal arriving at the receiver, denoted $Y(kN+n)$, will
be:-

$$Y(kN+n) = \sum_{u=0}^{U-1} D_u(k)\,C_{u,n} + G(kN+n)$$
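
The signal model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `make_codes` and `received_bit`, the seed, and the values U = 4 and sigma = 0.5 are hypothetical choices that do not come from the text; only the random ±1 codes of 64 chips and the synchronous AWGN model do.

```python
import random

def make_codes(num_users, n_chips, rng):
    """Random +/-1 spreading codes; codes[u][n] plays the role of C_{u,n}."""
    return [[rng.choice((-1, 1)) for _ in range(n_chips)]
            for _ in range(num_users)]

def received_bit(codes, bits, sigma, rng):
    """Return the N chip-rate samples Y(kN+n) for one bit period k."""
    n_chips = len(codes[0])
    return [
        sum(d * code[n] for d, code in zip(bits, codes)) + rng.gauss(0.0, sigma)
        for n in range(n_chips)
    ]

rng = random.Random(0)        # fixed seed, purely for repeatability
U, N, sigma = 4, 64, 0.5      # illustrative values, not taken from the text
codes = make_codes(U, N, rng)
bits = [rng.choice((-1, 1)) for _ in range(U)]   # D_u(k) for one bit time k
y = received_bit(codes, bits, sigma, rng)
```

Each call to `received_bit` produces one bit period; a longer simulation would simply repeat it with fresh data bits.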

Receiver structures and theoretical performance 

The four receiver structures shown in Fig. 16 will be
examined. Receiver structure a) is the simple matched filter
case. The receiver consists of an N tap FIR filter whose
co-efficients $h_n$ are a scaled version of the desired user's
code, ie:-

$$h_n \propto C_{0,n}, \qquad n = 0, 1, \ldots, N-1$$

The matched filter treats the MAI as noise and therefore, if
the signal power of each user is one and the codes are random,
the central limit theorem allows us to approximate the BER for
this receiver as:-

$$\mathrm{BER}_{mf} = \mathrm{erfc}\left(\sqrt{\frac{N}{\sigma^2 + (U-1)}}\right) \qquad (1)$$

where erfc() is the complementary error function.
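
The Gaussian approximation for the matched filter can be tried numerically with the sketch below. Two assumptions are made that the text does not confirm: the filter is scaled by 1/N so that the wanted output is ±1, and the BER is evaluated with the standard Q-function form 0.5·erfc(z/√2), which may differ from the normalisation intended for erfc() in the text. The function names are hypothetical.

```python
import math
import random

def matched_filter_output(y, code):
    """Correlate one bit period of chips with the code (assumed 1/N scaling)."""
    return sum(yi * ci for yi, ci in zip(y, code)) / len(code)

def gaussian_ber(mean, variance):
    """Q(mean / sqrt(variance)) written with the standard library erfc."""
    return 0.5 * math.erfc(mean / math.sqrt(2.0 * variance))

def matched_filter_ber_estimate(num_users, n_chips, sigma2):
    """Unit signal; noise-plus-MAI variance (sigma^2 + U - 1) / N after despreading."""
    return gaussian_ber(1.0, (sigma2 + num_users - 1) / n_chips)

# Single-user, noise-free check: the output equals the transmitted bit.
rng = random.Random(1)
code = [rng.choice((-1, 1)) for _ in range(64)]
out = matched_filter_output([-c for c in code], code)   # transmitted bit = -1

# The estimate degrades as interferers are added (sigma^2 = 0.5 is illustrative).
ber_8 = matched_filter_ber_estimate(8, 64, 0.5)
ber_32 = matched_filter_ber_estimate(32, 64, 0.5)
```

The ratio N/(σ² + U − 1) inside the estimate is the same quantity that appears under the square root in equation (1); only the outer function is an assumption.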

Receiver structure b) is a simple parallel canceller
using matched filters to despread each interfering signal and
reconstruct a replica which is subtracted from an
appropriately delayed input signal, cancelling much of the
interference. The remaining signal is then despread using a
filter matched to the desired user. The reconstruction of the
signal assumes that the power of the signal is known exactly.
Therefore an interfering signal which is received with the
correct sign will be cancelled exactly and an interfering
signal received incorrectly will have its amplitude doubled
(power quadrupled). A lower bound on the BER for this
structure is given by:-

$$\mathrm{BER} = \mathrm{erfc}\left(\sqrt{\frac{N}{\sigma^2 + 4.0\,\mathrm{BER}_{mf}\,(U-1)}}\right) \qquad (2)$$

This equation is only a lower bound for two reasons. Firstly
the noise samples for each matched filter in the cancellation
stage are the same, whereas the above equation assumes they
are independent samples. Secondly, after cancellation the
combined interference plus noise distribution of the remaining
signal is a poor approximation to a Gaussian distribution,
because it consists of the Gaussian noise and a limited number
of enlarged interfering signals.
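
The behaviour of a single cancellation stage can be checked by hand with a small sketch. The example below is hypothetical: it uses three length-4 Walsh rows and no noise so that every value is an exact integer, whereas the text uses random codes of 64 chips. The names `despread` and `cancel` are illustrative only.

```python
def sign(v):
    return 1 if v >= 0 else -1

def despread(y, code):
    """Matched filter for one bit period, scaled by 1/N (assumed scaling)."""
    return sum(yi * ci for yi, ci in zip(y, code)) / len(code)

def cancel(y, codes, estimates, desired=0):
    """Subtract a unit amplitude replica of every interferer from the input."""
    residual = list(y)
    for u, code in enumerate(codes):
        if u != desired:
            for n, chip in enumerate(code):
                residual[n] -= estimates[u] * chip
    return residual

codes = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1]]
bits = [1, -1, 1]
y = [sum(d * c[n] for d, c in zip(bits, codes)) for n in range(4)]

# Correct interferer decisions: only the desired user's signal remains.
estimates = [sign(despread(y, c)) for c in codes]
good = cancel(y, codes, estimates)

# A wrong decision for user 1 doubles that interferer's amplitude instead.
bad = cancel(y, codes, [1, 1, 1])
decision = sign(despread(good, codes[0]))
```

Here `good` is the desired user's code times its data bit, while `bad` contains the desired signal plus twice the signal of user 1, which is the amplitude doubling (power quadrupling) described above.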

Receiver structure c) is the Wiener, Levinson or MMSE
producing filter [15] [1]. It is also an N tap FIR filter, but
the co-efficients of this filter are given (in vector form)
by [24]:-

$$h = \Phi_{yy}^{-1}\,\Phi_{yx}$$

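where

$$h = [h_0, h_1, h_2, \ldots, h_{N-1}]^T$$

The T superscript denotes a matrix transpose. $\Phi_{yy}$ is the
autocorrelation matrix of the input signal with dimensions
N×N, given by:-

$$\Phi_{yy} = QQ^T + \sigma^2 I$$

The matrix Q has dimensions N×U and has the codes as its
columns, ie:-

$$Q = \begin{bmatrix}
C_{0,0} & C_{1,0} & C_{2,0} & \cdots & C_{U-1,0} \\
C_{0,1} & C_{1,1} & C_{2,1} & \cdots & C_{U-1,1} \\
C_{0,2} & C_{1,2} & C_{2,2} & \cdots & C_{U-1,2} \\
\vdots & \vdots & \vdots & & \vdots \\
C_{0,N-1} & C_{1,N-1} & C_{2,N-1} & \cdots & C_{U-1,N-1}
\end{bmatrix}$$

I is the N×N identity matrix. $\Phi_{yx}$ is the cross correlation
vector:-

$$\Phi_{yx} = [C_{0,0}, C_{0,1}, C_{0,2}, \ldots, C_{0,N-1}]^T$$

The MMSE is given by:-

$$\mathrm{MMSE} = 1 - h^T\Phi_{yx}$$

and the BER performance (derivation A) is:-

$$\mathrm{BER}_{Wiener} = \mathrm{erfc}\left(\sqrt{\frac{(h^T\Phi_{yx})^2}{-(h^T\Phi_{yx})^2 + 2.0\,h^T\Phi_{yx} - 1.0 + \mathrm{MMSE}}}\right) \qquad (3)$$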
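
The Wiener co-efficients can be computed directly from the code set and the noise variance. The following is a minimal sketch using only the Python standard library; the helper `solve` (plain Gaussian elimination) and the function name `wiener_filter` are assumptions introduced for illustration, and the 4 chip single-user example is chosen only because its answer is known in closed form.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def wiener_filter(codes, sigma2, desired=0):
    """Return (h, h^T Phi_yx, MMSE) with Phi_yy = Q Q^T + sigma^2 I."""
    n = len(codes[0])
    phi_yy = [[sum(c[i] * c[j] for c in codes) + (sigma2 if i == j else 0.0)
               for j in range(n)] for i in range(n)]
    phi_yx = [float(v) for v in codes[desired]]
    h = solve(phi_yy, phi_yx)
    gain = sum(hi * pi for hi, pi in zip(h, phi_yx))
    return h, gain, 1.0 - gain

# Single user: the Wiener filter reduces to the code scaled by 1/(N + sigma^2).
code = [1, -1, 1, 1]
h, gain, mmse = wiener_filter([code], sigma2=0.5)
```

With several users in `codes` the same call returns taps that trade noise enhancement against MAI suppression, and `mmse` is the quantity that enters equation (3).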



To obtain average BER performance, this equation must be
averaged over all the users in the system to give
$\overline{\mathrm{BER}}_{Wiener}$.



Receiver structure d) is a parallel canceller using
Wiener filters for the initial estimate of the interferers'
data bits. By analogy with equation 2, a lower bound on the
BER performance of this structure is given by:-

$$\mathrm{BER} = \mathrm{erfc}\left(\sqrt{\frac{N}{\sigma^2 + 4.0\,\overline{\mathrm{BER}}_{Wiener}\,(U-1)}}\right) \qquad (4)$$

Results 

All simulated results in this section are for 60,000 data
bits, with the BER averaged over all users. Graphs show the
average BER for all users in the system plotted against number
of users. The sequence length is 64. The signal to Gaussian
noise ratio is expressed as:-

E b /N 0 - — 

Fig. 17 shows a comparison between the theoretical
results derived above and simulation results. The results
show that equations (1) and (3) give good and unbiased
estimates of the simulated results for the matched filter and
Wiener filter respectively. The deviation of the simulated
results for the matched filter from equation (1) with a low
number of users is caused by the interference distribution
only being approximately Gaussian and the limited number of
simulated points. Equations (2) and (4) however only give an
approximation to the actual performance of the two
cancellation receivers. This estimate is biased towards a
lower BER for reasons that were discussed above.

Fig. 18 shows simulated results for a range of signal to
noise ratios. These results show that a matched filter is only
a good receiver structure when Gaussian noise is the dominant
source of interference. This only applies when there are very
few interfering users and the signal to noise ratio is low.

A single stage of parallel cancellation based on matched
filters performs better than a matched filter except at
extremely high levels of interference, where the BER is too
high for most communications systems anyway. The improvement
is fairly significant: with $E_b/N_0$ = 9 dB, the cancellation
receiver can support three times as many users as the matched
filter if the required BER is $10^{-2}$.

The Wiener filter will always perform at least as well as
the matched filter. The only case where the performance of
these two receiver systems is the same is when there is no MAI
and the Wiener filter becomes a scaled version of the matched
filter. The Wiener filter also performs significantly better
than the matched filter based cancellation receiver. With
$E_b/N_0$ = 9 dB and a required BER of $10^{-2}$, the Wiener
filter will allow approximately six times as many users as a
matched filter and twice as many as the matched filter based
cancellation receiver. The computational complexity of the
Wiener filter receiver is identical to the matched filter
receiver, and significantly less than the matched filter based
cancellation receiver. This assumes that the Wiener
co-efficients are calculated in advance.

The Wiener filter based cancellation receiver gives the
best performance of all. It will allow approximately seven
times the number of users if $E_b/N_0$ = 9 dB and the required
BER is $10^{-2}$, compared with the matched filter receiver.

Note that in all cases there is a significant variation
in performance between users. For example, using the Wiener
filter receiver with 48 users and $E_b/N_0$ = 9 dB, the average
BER is 0.00940 but the user with the best spreading sequence
experiences a BER of 0.001167 and the user with the worst
spreading sequence experiences a BER of 0.038092. This
variation is due to the cross correlation properties of the
code set and will also apply to an orthogonal code set which
has lost its orthogonality due to a multipath channel.

Discussion and Conclusions 

In this paper we have shown that cancellation and Wiener
filtering both perform much better in MAI than a simple DMF.
Wiener filters in particular offer a large increase in the
number of users that can be accommodated. The two ideas can
be successfully combined and provide a better performance than
either on its own.

There is a cost to pay for improved performance. The
cancellation receiver introduces a delay of one data bit and
requires a substantial increase in either computation or
hardware. Both the Wiener filter and the cancellation receiver
require knowledge of the number and spreading sequences of the
interfering users at the receiver, and the Wiener filter also
requires a priori knowledge of the signal to noise ratio. If
the signal to noise ratio is very high, calculation of the
inverse of the autocorrelation matrix for the Wiener filter
can be difficult as it becomes close to singular. In a
computer network application the calculation of the Wiener
filter in advance may be feasible, but in a mobile
communications application the signal to noise ratio and the
number of users vary rapidly with time. However, a good
approximation to the Wiener filter can often be obtained using
an adaptive algorithm.

Derivation A 

To calculate the BER for the Wiener filter we require to
find a relationship between the MMSE and the variance at the
output of the filter.

$$\mathrm{MMSE} = E[(D_0(k) - x)^2] = E[D_0^2(k)] - 2.0\,E[D_0(k)\,x] + E[x^2]$$

where E[] denotes the expected value and x denotes the filter
output. $D_0(k)$ takes the values ±1, therefore:-

$$\mathrm{MMSE} = 1 - 2.0\,h^T\Phi_{yx} + E[x^2] \qquad [A.1]$$

The variance of the filter output is given by:-

$$\mathrm{var}(x) = E[(x - \bar{x})^2] = E[\bar{x}^2] - 2.0\,E[x\bar{x}] + E[x^2]$$

where $\bar{x}$ denotes E[x]. For a transmitted data bit of +1,
$\bar{x} = h^T\Phi_{yx}$, so that:-

$$\mathrm{var}(x) = (h^T\Phi_{yx})^2 - 2.0\,(h^T\Phi_{yx})^2 + E[x^2] \qquad [A.2]$$

Subtracting [A.1] from [A.2] and rearranging gives:-

$$\mathrm{var}(x) = -(h^T\Phi_{yx})^2 + 2.0\,h^T\Phi_{yx} - 1.0 + \mathrm{MMSE}$$

Therefore, assuming the output of the Wiener filter has a
Gaussian distribution, the BER is given by:-

$$\mathrm{BER}_{Wiener} = \mathrm{erfc}\left(\sqrt{\frac{(h^T\Phi_{yx})^2}{\mathrm{var}(x)}}\right) = \mathrm{erfc}\left(\sqrt{\frac{(h^T\Phi_{yx})^2}{-(h^T\Phi_{yx})^2 + 2.0\,h^T\Phi_{yx} - 1.0 + \mathrm{MMSE}}}\right)$$
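
As a worked step, the result can be simplified when h is the exact Wiener solution, for which the text gives MMSE = 1 − hᵀΦ_yx. The shorthand m for hᵀΦ_yx is introduced here only for brevity and is not part of the original notation; the simplification holds only under that assumption and not for an arbitrary filter.

```latex
\begin{aligned}
m &= h^T\Phi_{yx}, \qquad \mathrm{MMSE} = 1 - m \\
\mathrm{var}(x) &= -m^2 + 2.0\,m - 1.0 + (1 - m) = m\,(1 - m) \\
\frac{m^2}{\mathrm{var}(x)} &= \frac{m}{1 - m} = \frac{1 - \mathrm{MMSE}}{\mathrm{MMSE}}
\end{aligned}
```

Under this assumption the argument of erfc() depends on the MMSE alone, so a smaller MMSE always corresponds to a lower predicted BER.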



VOLTERRA RECEIVER 

The Volterra receiver is made up of a power series
expansion of the received signal followed by a linear filter
such as MMSE where the Volterra expanded signal v is used
instead of the original received signal y. For a DS-CDMA
system with received input signal y(kN+n), the output of the
third order Volterra expansion is given by:-

$$\begin{aligned}
D ={} & \sum_{a=0}^{N-1} h_1(a)\,y(kN+n-a) \\
& + \sum_{a=0}^{N-1}\sum_{b=0}^{N-1} h_2(a,b)\,y(kN+n-a)\,y(kN+n-b) \\
& + \sum_{a=0}^{N-1}\sum_{b=0}^{N-1}\sum_{c=0}^{N-1} h_3(a,b,c)\,y(kN+n-a)\,y(kN+n-b)\,y(kN+n-c)
\end{aligned}$$

where the h co-efficients are estimated using a linear
technique such as MMSE with the autocorrelation and
crosscorrelation matrices replaced with the autocorrelation
and crosscorrelation matrices of the power series expanded
signal.
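
The expansion can be illustrated with a short sketch. The function names and the two sample window are hypothetical; `window[a]` stands for y(kN+n−a), and the kernels are arbitrary numbers chosen so that the result can be checked by hand. The sketch shows that the triple sum equals an ordinary linear filter applied to the expanded vector v.

```python
from itertools import product

def volterra_features(window):
    """Expanded vector v: all first, second and third order products."""
    n = len(window)
    first = list(window)
    second = [window[a] * window[b] for a, b in product(range(n), repeat=2)]
    third = [window[a] * window[b] * window[c]
             for a, b, c in product(range(n), repeat=3)]
    return first + second + third

def volterra_output(window, h1, h2, h3):
    """Evaluate the three sums directly from the kernels h1, h2 and h3."""
    n = len(window)
    out = sum(h1[a] * window[a] for a in range(n))
    out += sum(h2[a][b] * window[a] * window[b]
               for a in range(n) for b in range(n))
    out += sum(h3[a][b][c] * window[a] * window[b] * window[c]
               for a in range(n) for b in range(n) for c in range(n))
    return out

window = [1.0, -2.0]
h1 = [1.0, 0.0]
h2 = [[0.0, 1.0], [0.0, 0.0]]
h3 = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
out = volterra_output(window, h1, h2, h3)

# The same result as a linear filter on v, with the kernels flattened in the
# same order as the features.
v = volterra_features(window)
flat = h1 + [x for row in h2 for x in row] + [x for p in h3 for r in p for x in r]
linear = sum(w * f for w, f in zip(flat, v))
```

The length of v grows as N + N² + N³, which is why the linear estimation step operates on much larger correlation matrices than in the plain MMSE receiver.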



CLAIMS

1. A direct sequence code division multiple access
receiver comprising an adaptive filter controlled by an
adaptive algorithm for filtering data which has been
multiplied by a spreading code and filtered by a channel
filter, the adaptive filter having a length appropriate to
model the inverse of the channel filter, and a multiuser
detector operating on the output of the adaptive filter.

2. A receiver according to claim 1, wherein the
algorithm is trained using the signal of a desired user.

3. A receiver according to claim 1 or 2, wherein the
algorithm is trained using a composite signal from more than
one user.

4. A receiver according to claim 1, 2 or 3, wherein
the multiuser detector is of the minimum mean squared error
type.

5. A receiver according to claim 1, 2 or 3, wherein 
the multiuser detector is of the zero forcing 
(decorrelating) type. 

6. A receiver according to claim 1, 2 or 3, wherein
the multiuser detector is of the Volterra type.

7. A receiver according to claim 1, 2 or 3, wherein 
the multiuser detector is of the Radial Basis Function type. 

8. A receiver according to claim 1, 2 or 3, wherein 
the multiuser detector is of the cancellation type.



9. A receiver according to claim 1, 2 or 3, wherein
the multiuser detector is of the near optimum decoding type.

10. A receiver according to any preceding claim,
wherein the algorithm comprises the least mean squares
algorithm.

11. A receiver according to any one of claims 1 to 9,
wherein the algorithm comprises the recursive least squares
algorithm.

12. A receiver according to any one of claims 1 to 9,
wherein the algorithm comprises the fast a-posteriori error
sequential technique algorithm.

13. A receiver according to any one of claims 1 to 9,
wherein the algorithm comprises the stabilised fast
a-posteriori error sequential technique algorithm.

14. A receiver according to claim 12 or 13, wherein
said algorithm is used in combination with the Fast Newton
algorithm.